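Disruption prediction has made rapid progress in recent years, especially with machine learning (ML)-based methods. For future tokamak disruption predictors, understanding why a predictor makes a certain prediction can be as crucial as the accuracy of the prediction itself. Most disruption predictors aim at accuracy or cross-machine capability. However, if a disruption prediction model can be interpreted, it can explain why certain samples are classified as disruption precursors. This allows us to identify the type of incoming disruption and gives us insight into the disruption mechanism. This paper designs a disruption predictor called the Interpretable Disruption Predictor based on Physics-Guided Feature Extraction (IDP-PGFE) on J-TEXT. Extracting physics-guided features effectively improves the prediction performance of the model, and a high-performance model is required to ensure the validity of the interpretation results. The interpretability study of IDP-PGFE provides an understanding of J-TEXT disruptions that is generally consistent with the existing understanding of disruptions. IDP-PGFE has been applied to disruptions caused by continuously increasing density in density-limit experiments on J-TEXT. The time evolution of the PGFE feature contributions indicates that the application of ECRH triggers radiation-induced disruptions, which lowers the density at disruption, whereas the application of RMP does raise the density limit in J-TEXT. The interpretability study guides intuition about the physical mechanism of density-limit disruptions: RMPs affect not only the MHD instabilities but also the radiation profile, which delays the density-limit disruption.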
translated by 谷歌翻译
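Predicting disruptions across different tokamaks is a great obstacle to overcome. Future tokamaks can hardly tolerate disruptions during high-performance discharges. The few disruptive high-performance discharges available can hardly make up an abundant training set, which makes it difficult for current data-driven methods to obtain acceptable results. A machine learning method capable of transferring a disruption prediction model trained on one tokamak to another is required to solve this problem. The key is a disruption prediction model containing a feature extractor that can extract common disruption precursor traces from tokamak diagnostic data, together with a transferable disruption classifier. Based on these concerns, the paper first presents a deep fusion feature extractor designed specifically to extract disruption precursor features from common diagnostics on tokamaks according to currently known disruption precursors, providing a promising foundation for transferable models. The fusion feature extractor is validated by comparison with manual feature extraction on J-TEXT. Based on the feature extractor trained on J-TEXT, the disruption prediction model was transferred to EAST data using only 20 discharges from EAST experiments. Its performance is comparable to that of a model trained on 1896 discharges. Compared with other model training scenarios, transfer learning shows its potential for predicting disruptions across different tokamaks.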
Few-shot instance segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset under the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on few-shot object detection. Code and model will be available.
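The idea of building a class center from a support mask and using it to re-weight query features can be illustrated with a minimal sketch. This is not the RefT module itself: masked average pooling and the sigmoid gate are assumptions chosen for illustration, and the function names `masked_class_center` and `reweight_query` are hypothetical.

```python
from math import exp
from typing import List, Sequence


def masked_class_center(features: Sequence[Sequence[float]],
                        mask: Sequence[int]) -> List[float]:
    """Average each channel over the pixels selected by a binary support mask.

    features: one flattened feature map per channel.
    mask: flattened binary mask with the same number of pixels.
    (Masked average pooling is an assumption, not the paper's exact module.)
    """
    area = sum(mask)
    if area == 0:
        raise ValueError("support mask is empty")
    return [sum(v * m for v, m in zip(channel, mask)) / area
            for channel in features]


def reweight_query(query: Sequence[Sequence[float]],
                   center: Sequence[float]) -> List[List[float]]:
    """Scale each query channel by a sigmoid gate derived from the class center.

    The sigmoid gate is an illustrative choice of re-weighting function.
    """
    gates = [1.0 / (1.0 + exp(-c)) for c in center]
    return [[v * g for v in channel] for channel, g in zip(query, gates)]
```

A channel whose center value is strongly positive passes through almost unchanged, while a channel with a strongly negative center is suppressed, which is one simple way to make the query features class-aware.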
This paper focuses on designing efficient models with few parameters and FLOPs for dense predictions. Even though CNN-based lightweight methods have achieved stunning results after years of research, the trade-off between model accuracy and constrained resources still needs further improvement. This work rethinks the essential unity of the efficient Inverted Residual Block in MobileNetv2 and the effective Transformer in ViT, inductively abstracting a general concept of the Meta-Mobile Block, and we argue that the specific instantiation is very important to model performance even when the same framework is shared. Motivated by this phenomenon, we deduce a simple yet efficient modern Inverted Residual Mobile Block (iRMB) for mobile applications, which absorbs CNN-like efficiency to model short-distance dependency and Transformer-like dynamic modeling capability to learn long-distance interactions. Furthermore, we design a ResNet-like 4-phase Efficient MOdel (EMO) based only on a series of iRMBs for dense applications. Extensive experiments on the ImageNet-1K, COCO2017, and ADE20K benchmarks demonstrate the superiority of our EMO over state-of-the-art methods, e.g., our EMO-1M/2M/5M achieve 71.5, 75.1, and 78.4 Top-1, surpassing SoTA CNN-/Transformer-based models while balancing model accuracy and efficiency well.
Despite significant progress in object categorization in recent years, a number of important challenges remain, mainly the ability to learn from limited labeled data and to recognize object classes within a large, potentially open, set of labels. Zero-shot learning is one way of addressing these challenges, but it has only been shown to work with limited-size class vocabularies and typically requires separation between supervised and unsupervised classes, allowing the former to inform the latter but not vice versa. We propose the notion of vocabulary-informed learning to alleviate the above-mentioned challenges and address problems of supervised, zero-shot, generalized zero-shot, and open set recognition using a unified framework. Specifically, we propose a weighted maximum margin framework for semantic manifold-based recognition that incorporates distance constraints from (both supervised and unsupervised) vocabulary atoms. Distance constraints ensure that labeled samples are projected closer to their correct prototypes, in the embedding space, than to others. We illustrate that the resulting model shows improvements in supervised, zero-shot, generalized zero-shot, and large open set recognition, with up to a 310K class vocabulary on the Animals with Attributes and ImageNet datasets.
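The distance constraint described above can be written out as a margin condition. The following is a sketch under assumed notation, not the paper's exact formulation: $g(x_i)$ denotes the projection of labeled sample $x_i$ into the embedding space, $u_{z_i}$ the prototype of its correct class, $u_a$ any other vocabulary atom, $C$ a margin, and $w_{ia}$ a per-constraint weight.

```latex
% Margin constraint: the correct prototype must be closer than any other atom.
\[
  \lVert g(x_i) - u_{z_i} \rVert_2^2 + C
  \;\le\;
  \lVert g(x_i) - u_a \rVert_2^2
  \qquad \text{for all } a \neq z_i .
\]
% A weighted hinge relaxation that penalizes violated constraints.
\[
  \mathcal{L}_{\text{margin}}
  = \sum_i \sum_{a \neq z_i} w_{ia}
    \Bigl[\, C + \lVert g(x_i) - u_{z_i} \rVert_2^2
             - \lVert g(x_i) - u_a \rVert_2^2 \Bigr]_+ ,
  \qquad [t]_+ = \max(t, 0).
\]
```

Because the atoms $u_a$ may come from both supervised and unsupervised vocabulary, the same constraint lets unlabeled class names shape the embedding even though no images are available for them.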
Deploying reliable deep learning techniques in interdisciplinary applications requires learned models to output accurate and (even more importantly) explainable predictions. Existing approaches typically explicate network outputs in a post-hoc fashion, under an implicit assumption that faithful explanations come from accurate predictions/classifications. We make the opposite claim that explanations boost (or even determine) classification. That is, end-to-end learning of explanation factors to augment discriminative representation extraction could be a more intuitive strategy to inversely assure fine-grained explainability, e.g., in neuroimaging and neuroscience studies with high-dimensional data containing noisy, redundant, and task-irrelevant information. In this paper, we propose such an explainable geometric deep network, dubbed NeuroExplainer, with applications to uncovering altered infant cortical development patterns associated with preterm birth. Given fundamental cortical attributes as network input, NeuroExplainer adopts a hierarchical attention-decoding framework to learn fine-grained attentions and respective discriminative representations to accurately distinguish preterm infants from term-born infants at term-equivalent age. NeuroExplainer learns the hierarchical attention-decoding modules under subject-level weak supervision coupled with targeted regularizers deduced from domain knowledge regarding brain development. These prior-guided constraints implicitly maximize the explainability metrics (i.e., fidelity, sparsity, and stability) in network training, driving the learned network to output detailed explanations and accurate classifications. Experimental results on the public dHCP benchmark suggest that NeuroExplainer yields quantitatively reliable explanation results that are qualitatively consistent with representative neuroimaging studies.
Improving the visual quality of a given degraded observation by correcting its exposure level is a fundamental task in the computer vision community. Existing works commonly lack adaptability to unknown scenes because of their data-driven patterns (deep networks) or limited regularization (traditional optimization), and they usually need time-consuming inference. These two points heavily limit their practicability. In this paper, we establish a Practical Exposure Corrector (PEC) that combines efficiency and performance. Concretely, we rethink exposure correction to provide a linear solution with exposure-sensitive compensation. To generate the compensation, we introduce an exposure adversarial function as the key engine to fully extract valuable information from the observation. By applying the defined function, we construct a segmented shrinkage iterative scheme to generate the desired compensation. Its shrinkage nature provides strong support for algorithmic stability and robustness. Extensive experimental evaluations fully reveal the superiority of our proposed PEC. The code is available at https://rsliu.tech/PEC.
Sparse principal component analysis (SPCA) has been widely used for dimensionality reduction and feature extraction in high-dimensional data analysis. Although there have been many methodological and theoretical developments in the past two decades, the theoretical guarantees of the popular SPCA algorithm proposed by Zou, Hastie & Tibshirani (2006) based on the elastic net are still unknown. We aim to close this important theoretical gap in this paper. We first revisit the SPCA algorithm of Zou et al. (2006) and present our implementation. We also study a computationally more efficient variant of the SPCA algorithm in Zou et al. (2006) that can be considered the limiting case of SPCA. We provide guarantees of convergence to a stationary point for both algorithms. We prove that, under a sparse spiked covariance model, both algorithms can recover the principal subspace consistently under mild regularity conditions. We show that their estimation error bounds match the best available bounds of existing works or the minimax rates up to some logarithmic factors. Moreover, we demonstrate the numerical performance of both algorithms in simulation studies.
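For orientation, the elastic-net criterion of Zou et al. (2006) that the abstract refers to is usually stated as below. The notation ($x_i$ for the rows of the data matrix $X$, $A = [\alpha_1, \dots, \alpha_k]$, $B = [\beta_1, \dots, \beta_k]$) is the common one and is an assumption about how the paper under discussion writes it. The second display is the soft-thresholding update that Zou et al. obtain as the ridge penalty $\lambda \to \infty$; reading the abstract's "limiting case" as this update is also an assumption.

```latex
% SPCA criterion: regression-type reconstruction loss with elastic-net penalty.
\[
  (\hat{A}, \hat{B})
  = \arg\min_{A,\,B}\;
    \sum_{i=1}^{n} \bigl\lVert x_i - A B^{\top} x_i \bigr\rVert_2^2
    + \lambda \sum_{j=1}^{k} \lVert \beta_j \rVert_2^2
    + \sum_{j=1}^{k} \lambda_{1,j} \lVert \beta_j \rVert_1
  \quad \text{subject to } A^{\top} A = I_k .
\]
% Limiting case (lambda -> infinity): componentwise soft-thresholding for fixed A.
\[
  \hat{\beta}_j
  = \Bigl( \bigl\lvert X^{\top} X \alpha_j \bigr\rvert
           - \tfrac{\lambda_{1,j}}{2} \Bigr)_{+}
    \operatorname{sign}\bigl( X^{\top} X \alpha_j \bigr),
  \qquad j = 1, \dots, k .
\]
```

Both algorithms alternate between updating $B$ for fixed $A$ and updating $A$ for fixed $B$; the limiting variant is cheaper because its $B$-step is a closed-form thresholding rather than an elastic-net fit.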
During training, reinforcement learning systems interact with the world without considering the safety of their actions. When deployed into the real world, such systems can be dangerous and cause harm to their surroundings. Often, dangerous situations can be mitigated by defining a set of rules that the system should not violate under any conditions. For example, in robot navigation, one safety rule would be to avoid colliding with surrounding objects and people. In this work, we define safety rules in terms of the relationships between the agent and objects and use them to prevent reinforcement learning systems from performing potentially harmful actions. We propose a new safe epsilon-greedy algorithm that uses safety rules to override agents' actions if they are considered to be unsafe. In our experiments, we show that a safe epsilon-greedy policy significantly increases the safety of the agent during training, improves the learning efficiency resulting in much faster convergence, and achieves better performance than the base model.
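The override behavior described above can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the function name `safe_epsilon_greedy`, the `is_safe` predicate standing in for the safety rules, and the choice to replace an unsafe proposal with the highest-valued safe action are all assumptions.

```python
import random
from typing import Callable, Sequence


def safe_epsilon_greedy(q_values: Sequence[float],
                        is_safe: Callable[[int], bool],
                        epsilon: float,
                        rng: random.Random) -> int:
    """Propose an action epsilon-greedily, then override it if a rule forbids it."""
    n = len(q_values)
    if rng.random() < epsilon:
        action = rng.randrange(n)          # explore: uniform random action
    else:
        action = max(range(n), key=lambda a: q_values[a])  # exploit
    if is_safe(action):
        return action
    # Override step (assumed policy): fall back to the best action the rules allow.
    safe_actions = [a for a in range(n) if is_safe(a)]
    if not safe_actions:
        return action  # assumption: with no safe option, keep the proposal
    return max(safe_actions, key=lambda a: q_values[a])


if __name__ == "__main__":
    rng = random.Random(0)
    # Action 1 has the highest value but is declared unsafe by the rule.
    print(safe_epsilon_greedy([1.0, 5.0, 2.0], lambda a: a != 1, 0.0, rng))  # prints 2
```

Overriding after the proposal, rather than sampling only from safe actions, keeps the underlying agent unchanged and lets the safety layer be attached to any epsilon-greedy learner.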
Because existing wireless sensor network (WSN)-based anomaly detection methods consider and analyze only temporal features, this paper designs a self-supervised anomaly node detection method based on an autoencoder. The method integrates temporal WSN data flow feature extraction, spatial position feature extraction, and intermodal WSN correlation feature extraction into the design of the autoencoder to make full use of the spatial and temporal information of the WSN for anomaly detection. First, a fully connected network is used to extract the temporal features of nodes by considering a single mode from a local spatial perspective. Second, a graph neural network (GNN) is used to introduce the WSN topology from a global spatial perspective for anomaly detection and to extract the spatial and temporal features of the data flows of nodes and their neighbors by considering a single mode. Then, an adaptive fusion method involving weighted summation is used to extract the relevant features between different models. In addition, this paper introduces a gated recurrent unit (GRU) to solve the long-term dependence problem in the time dimension. Finally, the reconstructed output of the decoder and the hidden layer representation of the autoencoder are fed into a fully connected network to calculate the anomaly probability of the current system. Since the spatial feature extraction operation is advanced, the designed method can be applied to large-scale network anomaly detection by adding a clustering operation. Experiments show that the designed method outperforms the baselines, with an F1 score of 90.6%, which is 5.2% higher than that of the existing anomaly detection methods based on unsupervised reconstruction and prediction. Code and model are available at https://github.com/GuetYe/anomaly_detection/GLSL
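The "adaptive fusion method involving weighted summation" can be illustrated with a short sketch. The abstract does not say how the weights are produced, so the softmax over per-branch scores used here is an assumption, and `adaptive_fusion` is a hypothetical name; in a trained model the scores would be learned parameters.

```python
from math import exp
from typing import List, Sequence


def adaptive_fusion(branch_features: Sequence[Sequence[float]],
                    scores: Sequence[float]) -> List[float]:
    """Fuse feature vectors from several branches by a weighted sum.

    branch_features: one equal-length feature vector per branch.
    scores: one raw (learnable) score per branch; a softmax turns them into
            positive weights that sum to 1 (assumed weighting scheme).
    """
    if len(branch_features) != len(scores):
        raise ValueError("need exactly one score per branch")
    top = max(scores)                       # subtract the max for numerical stability
    exps = [exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(branch_features[0])
    return [sum(w * feats[d] for w, feats in zip(weights, branch_features))
            for d in range(dim)]
```

With equal scores the fusion reduces to a plain average of the branches; as one score grows, the output moves toward that branch's features, which is what lets the fusion adapt to whichever view is more informative.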